Abstract

Background: Esophageal squamous cell carcinoma (ESCC) has the highest mortality rates in China. The 5-year survival rate 
of ESCC remains dismal despite improvements in treatments such as surgical resection and adjuvant chemoradiation, and 
current clinical staging approaches are limited in their ability to effectively stratify patients for treatment options. The aim of 
the present study, therefore, was to develop an immunohistochemistry-based prognostic model to improve clinical risk 
assessment for patients with ESCC. 

Methods: We developed a molecular prognostic model based on the combined expression of epidermal growth
factor receptor (EGFR), phosphorylated Specificity protein 1 (p-Sp1), and Fascin proteins. The presence of this prognostic
model and associated clinical outcomes were analyzed for 130 formalin-fixed, paraffin-embedded esophageal curative 
resection specimens (generation dataset) and validated using an independent cohort of 185 specimens (validation dataset). 

Results: The expression of these three genes at the protein level was used to build a molecular prognostic model that was
highly predictive of ESCC survival in both generation and validation datasets (P = 0.001). Regression analysis showed that 
this molecular prognostic model was strongly and independently predictive of overall survival (hazard ratio = 2.358 [95% CI, 
1.391-3.996], P = 0.001 in generation dataset; hazard ratio = 1.990 [95% CI, 1.256-3.154], P = 0.003 in validation dataset). 
Furthermore, the predictive ability of these 3 biomarkers in combination was more robust than that of each individual 
biomarker. 

Conclusions: This technically simple immunohistochemistry-based molecular model accurately predicts ESCC patient 
survival and thus could serve as a complement to current clinical risk stratification approaches. 
Introduction

Among all types of cancer, esophageal cancer (EC) has the 
eighth and sixth highest incidence and mortality rates worldwide,
respectively [1]. Although esophageal adenocarcinoma (EAC) has 
become the predominant histological subtype in some western 
countries, esophageal squamous cell carcinoma (ESCC) remains 
dominant in China, with almost 90% of newly diagnosed patients 
exhibiting this cancer subtype [2]. The 5-year survival rate for 
ESCC remains dismal, despite improvements in treatments such 
as surgical resection and adjuvant chemoradiation. In current 
clinical practice, pathological tumor-node-metastasis (pTNM) 
stage is considered the optimal prognostic indicator. However,
this clinical staging approach is limited in its ability to precisely
stratify patients for treatment options due to wide variation in 
survival rates, such as that observed among T3N1 patients [3]. 
Clearly, identifying effective biomarkers to complement current 
clinical staging approaches is highly important. According to 
national guidelines [4,5], biomarkers should be sensitive, specific, 
cost-effective, fast, robust against variability, and more accurate 
than current clinical stages. A single biomarker, however, may be 
unlikely to fulfill all of these requirements. 

In recent decades, the identification of combinations of 
biomarkers instead of single biomarkers has become a popular 
research endeavor. Multi-gene signatures of breast cancer,
colorectal cancer, esophageal and gastroesophageal junction 
adenocarcinoma, and other cancer types have served as successful
prognostic indicators [3,6-9]. The ability of these gene signatures
to accurately predict survival provides a foundation on which to 
build molecular classification systems and individualized treatment 
approaches. To date, however, the application of molecular 
prognostic signatures is less advanced for ESCC than for other 
cancer subtypes. 

In a previous study, we showed that ESCC was associated with
the overexpression of Fascin, which was regulated by phosphorylated
Specificity protein 1 (p-Sp1) via activation of the epidermal
growth factor (EGF)/extracellular signal-regulated kinase (ERK)
signaling pathway [10]. Although the clinical significance of this
pathway remains unclear, the EGF receptor (EGFR), a transmembrane
glycoprotein belonging to the HER family of receptors,
is recognized as a negative prognostic indicator [11,12] and has
shown clinical relevance as a molecular target of cancer therapies
[13,14]. Fascin, an actin bundling protein, is also recognized as a
prognostic indicator, with its overexpression associated with
aggressive clinical phenotypes and poor survival [15-17]. Based
on the clinical significance of EGFR and Fascin, we hypothesized
that a combination of molecules from the EGFR/ERK/Fascin
signaling pathway could accurately predict cancer outcome.
Indeed, we found that a three-gene signature comprising the
expression of EGFR, p-Sp1, and Fascin proteins independently
predicted ESCC patient survival. This molecular prognostic model
could give rise to a new molecular stratification system and provide
a useful framework for future work on prognostic signatures for
ESCC and other cancers.

Materials and Methods 

Patients and specimens 

Paraffin-embedded tissues were derived from two independent 
cohorts of ESCC patients undergoing curative resection at 
Shantou Central Hospital between 2007 and 2009 (generation 
dataset, n = 130) or between 1987 and 1997 (validation dataset,
n = 185). Patients in the generation dataset were followed up for a 
median time period of 35.0 months, with follow-ups terminated on 
November 9, 2012. Patients in the validation dataset were 
followed up for a median and maximum time period of 33.6 
and 131.3 months, respectively. Overall survival rate (OS) was 
calculated during the period between surgery and death or final 
observation. Information on patient age, gender, stage of disease, 
therapy, and histopathology was obtained from medical records 
(Table 1). The study was approved by the ethical committee of the 
Central Hospital of Shantou City and the ethical committee of the 
Medical College of Shantou University, and written informed
consent was obtained from all surgical patients to use resected 
samples for research. 

Tissue microarrays (TMAs) and immunohistochemistry (IHC)

TMAs were constructed as previously described [17-19]. The 
primary antibodies used in this study were mouse anti-EGFR
(ready-to-use; ZSGB-BIO, Beijing, China), rabbit anti-Sp1 (phospho
T453, 1:100 dilution; Abcam, Cambridge, UK), and mouse
anti-human Fascin-1 (clone 55K-2, 1:100 dilution; Dako, Carpinteria,
CA). IHC was carried out using a two-step protocol (PV-9000
Polymer Detection System, ZSGB-BIO, Beijing, China) as
previously described [19]. 

Evaluation of IHC variables 

Tissue sections were independently and blindly assessed by
three histopathologists (Cao HH, Wang SH, and Shen JH).
Discrepancies were resolved by consensus. EGFR expression
was scored using the HercepTest criteria [20]: 0, no staining, or
membrane staining in less than 10% of the tumour cells; 1+, faint
or barely perceptible staining of only part of the membrane in
more than 10% of the tumour cells; 2+, weak to moderate staining
of the entire membrane in more than 10% of the tumour cells; and
3+, strong staining of the entire membrane in more than 10% of the
tumour cells. EGFR staining was predominantly located in the cell
membrane; cytoplasmic staining was considered non-specific and
was not included in the scoring. For statistical analysis, we divided
EGFR scores into two groups: scores of 0-2+ were considered
low-expression and scores of 3+ were considered high-expression.

Fascin expression was assessed by staining of cell cytoplasm. Its
expression was scored as described by Zhao et al. Each separate
tissue core was scored on the basis of the intensity and area of
positive staining. The intensity of positive staining was scored as
follows: 0, negative; 1, weak staining; 2, moderate staining; 3,
strong staining. The rate of positive cells was scored on a 0-4 scale
as follows: 0, 0-5%; 1, 6-25%; 2, 26-50%; 3, 51-75%; 4, >75%.
If the positive staining was homogeneous, a final score was
obtained by multiplying the two scores, producing a total
range of 0-12. When the staining was heterogeneous, each
component was scored independently and the component scores
were summed. For example, a specimen containing
25% tumor cells with moderate intensity (1 x 2 = 2), 25% tumor
cells with weak intensity (1 x 1 = 1), and 50% tumor cells without
immunoreactivity (2 x 0 = 0) received a final score of 2 + 1 + 0 = 3.
For statistical analysis, we divided Fascin scores into two groups:
scores of 0-10 were considered low-expression and scores of more
than 10 were considered high-expression.
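
The Fascin scoring rule can be sketched in a few lines of Python. This is an illustrative sketch, not the authors' software: the function names (`area_score`, `fascin_score`, `is_high_fascin`) are hypothetical, a homogeneous specimen is treated as a specimen with a single staining component, and percentages that fall between the published bins (for example 5.5%) are assigned to the higher bin as an assumption.

```python
def area_score(percent_positive):
    """Map the percentage of stained tumor cells to the 0-4 area score."""
    if percent_positive <= 5:
        return 0
    if percent_positive <= 25:
        return 1
    if percent_positive <= 50:
        return 2
    if percent_positive <= 75:
        return 3
    return 4


def fascin_score(components):
    """Sum area score x intensity over (percent, intensity) components.

    Intensity: 0 negative, 1 weak, 2 moderate, 3 strong.
    """
    return sum(area_score(pct) * intensity for pct, intensity in components)


def is_high_fascin(score):
    """Scores of more than 10 are treated as high expression."""
    return score > 10


# Worked example from the text: 25% moderate, 25% weak, 50% unstained.
example = [(25, 2), (25, 1), (50, 0)]
print(fascin_score(example))  # 2 + 1 + 0 = 3
```

With this rule, only specimens scoring 12 (strong staining in more than 75% of cells) or heterogeneous specimens whose component scores sum to more than 10 fall into the high-expression group.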

p-Sp1 expression was assessed by staining of cell nuclei.
Cytoplasmic staining was considered non-specific and not included
in the scoring. p-Sp1 expression levels were scored on a scale
ranging from 0 to 3+: 0 indicated no positive staining; 1+ indicated
only a few scattered stained cells or weak staining in less than 30%
of cells within a visual field; 2+ indicated cluster(s) of moderate to
strong staining in less than 30% of cells or weak staining in more
than 30% of cells; 3+ indicated cluster(s) of moderate to strong
staining in more than 30% of cells. For statistical analysis, we
divided p-Sp1 scores into two groups: scores of 0-2+ were
considered low-expression, and scores of 3+ were considered
high-expression.
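
The three dichotomization rules (EGFR 3+, p-Sp1 3+, Fascin more than 10) can be collected into one helper. The sketch below is illustrative only; the function names and the label format (`"2+"`, `"3+"`) are assumptions made for the example, not part of the original protocol.

```python
def parse_plus_score(label):
    """Convert an IHC label such as '0', '1+', '2+' or '3+' to an integer."""
    return int(label.rstrip("+"))


def dichotomize(egfr_label, psp1_label, fascin_total):
    """Return 1 (high expression) or 0 (low expression) for each marker."""
    return {
        "EGFR": 1 if parse_plus_score(egfr_label) == 3 else 0,
        "p-Sp1": 1 if parse_plus_score(psp1_label) == 3 else 0,
        "Fascin": 1 if fascin_total > 10 else 0,
    }


# Hypothetical patient: EGFR 2+, p-Sp1 3+, Fascin composite score 12.
print(dichotomize("2+", "3+", 12))
```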

Construction of a weighted OS predictive model 

Cox proportional hazards regression analysis was used to 
evaluate the association between biomarker expression and OS. 
We then constructed a model to estimate risk by summing the 
expression level of each biomarker (high-expression = 1,
low-expression = 0) multiplied by its regression coefficient [21-23].
Patients were dichotomized into high- or low-risk groups using the 
50th percentile (i.e., median) risk score as a cut-off value. 
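
A minimal sketch of this construction follows. The coefficients and patients are hypothetical placeholders chosen for illustration, and the handling of scores equal to the median (assigned to the low-risk group here) is an assumption, since the text does not state how ties were treated.

```python
from statistics import median


def risk_score(status, coefficients):
    """Sum of (binary expression status x Cox regression coefficient)."""
    return sum(coefficients[m] * status[m] for m in coefficients)


def split_by_median(scores):
    """Label each score 'high' if above the cohort median, else 'low'.

    Assumption: scores equal to the median go to the low-risk group.
    """
    cutoff = median(scores)
    return ["high" if s > cutoff else "low" for s in scores]


# Hypothetical coefficients and patients, for illustration only.
coefs = {"EGFR": 0.2, "p-Sp1": 0.7, "Fascin": 0.5}
patients = [
    {"EGFR": 0, "p-Sp1": 0, "Fascin": 0},
    {"EGFR": 1, "p-Sp1": 0, "Fascin": 0},
    {"EGFR": 0, "p-Sp1": 1, "Fascin": 1},
    {"EGFR": 1, "p-Sp1": 1, "Fascin": 1},
]
scores = [risk_score(p, coefs) for p in patients]
print(split_by_median(scores))
```

Because each biomarker enters as 0 or 1, the score takes only a handful of distinct values, so ties at the cut-off are common and the tie rule affects group sizes.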

Statistical analysis 

Statistical analyses were performed using SPSS 13.0 for 
Windows. Cumulative survival time was calculated by the 
Kaplan-Meier method and analysed by the log-rank test. Spearman's
two-sided rank correlation was used to explore the
correlations among the expression levels of the three proteins. Univariate
and multivariate analyses were based on the Cox proportional 
hazards regression model. Receiver operating characteristic 
(ROC) curve analysis was used to determine the predictive value 
Table 1. The clinicopathological characteristics of two datasets of patients with ESCC.

Clinical and pathological      Generation dataset      Validation dataset
indexes                        No.       %             No.       %

Specimens                      130                     185
Mean age                       59                      58
Age (year)
  < Mean age                   70        53.8          87        47.0
  ≥ Mean age                   60        46.2          98        53.0
Gender
  Male                         103       79.2          140       75.7
  Female                       27        20.8          45        24.3
Differentiation
  G1                           21        16.2          44        23.8
  G2                           97        74.6          111       60.0
  G3                           12        9.2           30        16.2
T-stage
  T1+T2                        17        13.1          32        17.3
  T3+T4                        113       86.9          153       82.7
N-stage
  N0                           63        48.5          122       65.9
  N1                           67        51.5          63        34.1
M-stage
  M0                           130       100           178       96.2
  M1                           0         0             7         3.8
pTNM-stage
  IA+IB+IIA+IIB                68        52.3          125       67.6
  IIIA+IIIB+IIIC+IV            62        47.7          60        32.4
Therapy
  Only Surgery                 84        64.6          119       64.3
  Surgery + chemo              19        14.6          39        21.1
  Surgery + radio              25        19.2          20        10.8
  Surgery + chemo + radio*     2         1.6           7         3.8

*, chemo, chemotherapy; radio, radiotherapy.
doi:10.1371/journal.pone.0106007.t001










of the parameters, and the differences in the area under the curve
(AUC) were detected by using GraphPad Prism 5. Kendall's
tau-b rank correlation analysis was used to evaluate the association
between the expression of the prognostic model and clinicopathological
factors. A P value of less than 0.05 was considered statistically
significant.

Results

IHC characteristics of EGFR, p-Sp1, and Fascin biomarkers

Three potential biomarkers from the EGFR/ERK/Fascin
signaling pathway were stained using IHC. EGFR and p-Sp1
staining were mainly observed in cell membranes and nuclei,
respectively, whereas Fascin staining was more diffuse throughout
the cytoplasm. Representative images of different staining scores
are shown in Figure 1. In epithelial tissue adjacent to carcinoma,
however, positive staining of EGFR and Fascin was apparent only
in the basal layer, while p-Sp1 staining was weak in the upper
granular layer of the epithelium (Figure S1). Our results were
consistent with other reports in ESCC, whereas Sp1 has not
previously been reported in ESCC [24,25].

Correlations between the three biomarkers 

In both the generation dataset and the validation dataset,
Spearman's rank correlation showed that EGFR expression
was positively associated with Fascin expression (r = 0.299,
P = 0.001 and r = 0.154, P = 0.037, respectively), whereas no
correlation was found between EGFR and p-Sp1 or between p-Sp1
and Fascin. Detailed information is shown in Figure S2.

Prognostic significance of EGFR, p-Sp1, and Fascin
expression and other clinical/pathological characteristics

In the generation dataset, the 1- and 3-year OS were 83.1% and
57.5%, respectively. In the validation dataset, the 1-, 3-, and 5-year
OS were 93.5%, 62.4%, and 50%, respectively. Univariate
analysis revealed that the three biomarkers (EGFR, p-Sp1, and
Fascin), as well as four pathological factors (Differentiation [G3 vs.
G1], N-stage, M-stage, and pTNM-stage), were significantly
associated with OS (Table 2). However, EGFR did not significantly
predict OS in the generation dataset, perhaps due to
heterogeneity in EGFR expression patterns between the two
datasets. Kaplan-Meier analysis provided further support that
EGFR, p-Sp1, and Fascin were significant predictors of OS in
both generation and validation datasets, except for EGFR in the
generation dataset (Figure S3). In the generation dataset, the 3-year
OS was significantly lower for the p-Sp1 and Fascin high-expression
groups than for the low-expression groups. In the
validation dataset, the 3- and 5-year OS were significantly lower
for the EGFR, p-Sp1, and Fascin high-expression groups than for the
low-expression groups.
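
The OS percentages quoted here are Kaplan-Meier estimates. As a reminder of how such an estimate is formed, the sketch below implements the product-limit estimator in plain Python. It is not the SPSS procedure used in the study, the function name is hypothetical, and the five follow-up records are invented for illustration.

```python
from collections import Counter


def kaplan_meier(times, events):
    """Return [(time, survival_probability)] at each distinct death time.

    times  -- follow-up in months (surgery to death or last observation)
    events -- 1 if the patient died at that time, 0 if censored
    """
    deaths = Counter(t for t, e in zip(times, events) if e == 1)
    exits = Counter(times)  # everyone leaving the risk set at time t
    at_risk = len(times)
    survival = 1.0
    curve = []
    for t in sorted(exits):
        d = deaths.get(t, 0)
        if d:
            # Multiply by the conditional probability of surviving time t.
            survival *= 1.0 - d / at_risk
            curve.append((t, survival))
        at_risk -= exits[t]
    return curve


# Hypothetical cohort of five patients (months, death indicator).
months = [5, 8, 8, 12, 20]
died = [1, 1, 0, 1, 0]
curve = kaplan_meier(months, died)
print(curve)
```

Patients censored at a given time still count as at risk at that time, which is the usual convention; the 3-year OS would be read off the curve as the survival probability at the last death time not exceeding 36 months.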

Predictive molecular prognostic model 

Our molecular prognostic model was calculated as
$Y = \beta_1 \times \text{EGFR} + \beta_2 \times \text{p-Sp1} + \beta_3 \times \text{Fascin}$, with $Y$ equal to the
risk score and $\beta_n$ equal to each gene's coefficient value from
univariate Cox proportional hazards regression analysis. In the
generation dataset, $\beta_1 = 0.141$, $\beta_2 = 0.736$, and $\beta_3 = 0.559$. In the
validation dataset, $\beta_1 = 0.479$, $\beta_2 = 0.514$, and $\beta_3 = 0.543$. Patients
were ranked and divided into high- and low-risk groups using the
50th percentile (i.e., median) risk score as the cut-off value.
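
As a worked illustration, consider a hypothetical patient in the generation dataset with low EGFR expression (0) and high p-Sp1 and Fascin expression (1 each). Taking the three generation-dataset coefficients in the order EGFR, p-Sp1, Fascin (0.141, 0.736, and 0.559; the assignment of 0.736 to p-Sp1 is inferred from its position in the list), the risk score would be:

```latex
Y = 0.141 \times 0 + 0.736 \times 1 + 0.559 \times 1 = 1.295
```

Because each biomarker enters as 0 or 1, the score can take at most eight distinct values in a dataset, ranging from 0 (all three low) to 0.141 + 0.736 + 0.559 = 1.436 (all three high) in the generation dataset.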

In the generation dataset, the 3-year OS for the high-risk group
was significantly lower than that for the low-risk group (43.3% vs.
73.6%; Figure 2A). Similar results were found in the validation
dataset, in which the 3- and 5-year OS for the high-risk group were
significantly lower than those for the low-risk group (51.4% and
37.2% vs. 73.6% and 61.8%, respectively; Figure 2A). Multivariate
Cox proportional hazards regression analysis showed that the
three-gene signature, along with pTNM-stage, was a strong and
independent predictor of OS (Table 2).



Predictive power of the molecular prognostic model 

In both the generation and validation datasets, receiver
operating characteristic (ROC) analysis showed that the predictive
power of the prognostic model was higher than that of each
biomarker individually. In the generation dataset, specificity and
sensitivity were 66.7% and 59.7%, respectively, and the area under
the curve (AUC) for OS was 0.632. Similar results
were found in the validation dataset, with 62% specificity, 55.7%
sensitivity, and an AUC of 0.588 (Figure 2B). Furthermore, in the
generation dataset, the predictive ability of the prognostic model
was not only higher than that of EGFR, p-Sp1, and Fascin
individually but also higher than that of all clinical/pathological
characteristics. However, in the validation dataset, the AUC for the
prognostic model was not larger than that for N-stage and
pTNM-stage, but specificity and sensitivity were optimal (Figure S4).
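
The reported figures can be checked for internal consistency. If the prognostic model enters the ROC analysis as a dichotomous predictor (high- versus low-risk), which is an assumption made here rather than something the text states, the ROC curve has a single operating point and its AUC equals the mean of sensitivity and specificity. The function name below is hypothetical.

```python
def auc_single_cutoff(sensitivity, specificity):
    """AUC of a ROC curve with one operating point.

    The curve runs (0, 0) -> (1 - specificity, sensitivity) -> (1, 1);
    the trapezoidal area under it is (sensitivity + specificity) / 2.
    """
    return (sensitivity + specificity) / 2


generation_auc = auc_single_cutoff(0.597, 0.667)
validation_auc = auc_single_cutoff(0.557, 0.620)
print(round(generation_auc, 3), validation_auc)
```

Under that assumption, the generation dataset gives 0.632 and the validation dataset gives approximately 0.588, in agreement with the AUC values reported above.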

Correlations between the prognostic model and clinical/ 
pathological characteristics 

Kendall's tau-b correlation analysis indicated that the prognostic
model was significantly related to N-stage (Table 3). In the
generation dataset, the proportion of high-risk patients was
significantly higher among patients with regional lymph node
metastasis (N1) than among patients without regional lymph node
metastasis (N0) (62.7% [42/67] vs. 42.9% [27/63], P = 0.035).
Similar results were obtained in the validation dataset (63.5% [40/63]
vs. 45.9% [56/122], P = 0.030). Other clinical/pathological
characteristics such as age, gender, differentiation, T-stage,
M-stage, pTNM-stage, and therapy, however, were not significantly
different between high-risk and low-risk patients.
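
The proportions above follow directly from the N-stage counts in Table 3. The sketch below recomputes them and, as an illustration, the tau-b coefficient itself, using the fact that for a 2x2 table Kendall's tau-b reduces to the phi coefficient. The function names are hypothetical, and the sketch does not reproduce the P values, which require a significance test not shown here.

```python
from math import sqrt


def high_risk_fraction(low, high):
    """Fraction of patients in a stratum classified as high-risk."""
    return high / (low + high)


def tau_b_2x2(a, b, c, d):
    """Kendall's tau-b for the 2x2 table [[a, b], [c, d]].

    For a 2x2 table, tau-b reduces to the phi coefficient.
    """
    return (a * d - b * c) / sqrt((a + b) * (c + d) * (a + c) * (b + d))


# Counts (low-risk, high-risk) taken from Table 3.
generation = {"N0": (36, 27), "N1": (25, 42)}
validation = {"N0": (66, 56), "N1": (23, 40)}

for name, table in (("generation", generation), ("validation", validation)):
    n0, n1 = table["N0"], table["N1"]
    print(name,
          f"N0 {high_risk_fraction(*n0):.1%}",
          f"N1 {high_risk_fraction(*n1):.1%}",
          f"tau-b {tau_b_2x2(*n0, *n1):.3f}")
```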

Combination of the prognostic model and N-stage 

As our results indicate that both the prognostic model and N- 
stage are involved in ESCC prognosis, we next considered these 
characteristics together. Patients were subdivided into four 
[Figure 2: panels for the generation and validation datasets; ROC curves plotted against 1 - Specificity, and AUC for OS shown for EGFR, p-Sp1, Fascin, and the prognostic model.]

Figure 2. Predictive ability of the molecular prognostic model. A, Kaplan-Meier analysis of OS for low-risk and high-risk ESCC patients based
on expression of the molecular prognostic model in generation and validation datasets. B, Predictive ability of the molecular prognostic model
compared with individual biomarkers shown by receiver operating characteristic (ROC) curves and area under the curve (AUC) in generation and
validation datasets.

doi:10.1371/journal.pone.0106007.g002



subgroups: N0+low-risk, N0+high-risk, N1+low-risk, and N1+high-risk.
N1+high-risk patients had the poorest prognoses,
whereas the other three groups showed no notable differences in
prognoses (data not shown); therefore, these three groups were
merged into a single group. Kaplan-Meier curves showed
significant differences in OS between the two groups (Figure 3).
In the generation dataset, the 3-year OS was 25.6% for the
N1+high-risk group compared with 72.7% for the other group. In the
validation dataset, the 3- and 5-year OS were 42.5% and 26.5%,
respectively, for the N1+high-risk group, compared with 67.9%
and 56.3% for the other group.
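
The merged grouping amounts to a simple two-way classification, sketched below. The function name and the string labels are hypothetical, chosen only to mirror the group names used in the text.

```python
def combined_group(n_stage, risk_group):
    """Merge N-stage and model risk group into the two strata compared."""
    if n_stage == "N1" and risk_group == "high":
        return "N1+high-risk"
    return "other"


# Hypothetical patients as (N-stage, risk group) pairs.
cohort = [("N0", "low"), ("N0", "high"), ("N1", "low"), ("N1", "high")]
groups = [combined_group(n, r) for n, r in cohort]
print(groups)
```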
Table 3. The correlation between molecular prognostic model and clinicopathological characteristics in ESCC.

Variables                  Generation dataset                 Validation dataset
                           Low-risk   High-risk   P*          Low-risk   High-risk   P*

Age (year)
  < Mean age               35         35          0.484       41         46          0.883
  ≥ Mean age               26         34                      48         50
Gender
  Male                     44         59          0.083       68         72          0.865
  Female                   17         10                      21         24
Differentiation
  G1                       9          12          0.465       23         21          0.905
  G2                       45         52                      49         62
  G3                       7          5                       17         13
T-stage
  T1+T2                    4          13          0.066       15         17          1.000
  T3+T4                    57         56                      74         79
N-stage
  N0                       36         27          0.035       66         56          0.030
  N1                       25         42                      23         40
M-stage
  M0                       61         69                      86         92          1.000
  M1                       0          0                       3          4
pTNM-stage
  IA+IB+IIA+IIB            37         31          0.081       65         60          0.157
  IIIA+IIIB+IIIC+IV        24         38                      24         36
Therapy
  Only Surgery             38         46          0.714       57         62          1.000
  Comprehensive Therapy#   23         23                      32         34

*, Kendall's tau-b test;
#, Comprehensive Therapy including Surgery + chemotherapy, Surgery + radiotherapy and Surgery + chemotherapy + radiotherapy.
doi:10.1371/journal.pone.0106007.t003



[Figure 3: Kaplan-Meier curves of OS against time in months. Generation dataset: N0+low/high-risk and N1+low-risk (n = 88) versus N1+high-risk (n = 42), P = 0.000. Validation dataset: N0+low/high-risk and N1+low-risk (n = 145) versus N1+high-risk (n = 40), P = 0.000.]

Figure 3. Kaplan-Meier analyses of OS considering a molecular prognostic model and N-stage in generation and validation
datasets.

doi:10.1371/journal.pone.0106007.g003
Discussion

Although many prospective studies have assessed potential 
biomarkers of cancer using high-throughput screening techniques 
[21-23], there is often little to no biological connection among the 
individual biomarkers. Furthermore, single biomarker predictor 
models often have limited power to predict cancer patient survival 
[26-28]. Therefore, the three-gene signature discovered in the
present study, which is comprised of molecules within the EGFR/ 
ERK/Fascin signaling pathway, may represent a useful preclinical 
model for improving ESCC treatment and clinical outcome. Using 
two independent cohorts of ESCC patients, our study both 
generates and validates this molecular prognostic model, which 
predicts poor prognosis. We investigated our molecular prognostic 
model at the protein level for two reasons. First, formalin-fixed, 
paraffin-embedded tissue is far more available than other types of
samples such as fresh-frozen tissue. Second, IHC is technically
simple, fast, economical, clinically applicable, and robust, in contrast
to assessments of gene expression at the mRNA level, which 
require standardization of techniques to allow comparison of data 
across days, laboratories, and types of samples [3]. 

This prognostic model made it possible to identify a cohort of
ESCC patients with a 5-year survival of 52%, which is remarkable
for this disease. Combining the prognostic model with N-stage, we
found that N1+high-risk patients had the poorest clinical outcome,
whereas N1+low-risk and N0+high/low-risk patients had similar
prognoses. This result, while surprising, could serve to guide
treatment options. That is, N1 and high-risk patients may urgently
require therapeutic intervention to improve their prognosis.
EGFR is a particularly promising molecular target of therapy, as
EGFR inhibitors have been widely applied to a variety of solid
tumors, such as lung cancer [13,14], colorectal cancer [29], breast
cancer [30], and even ESCC [31]. Some of these therapeutic
strategies have been subject to clinical trials, with four EGFR
inhibitors currently approved by the US Food and Drug
Administration, including gefitinib, erlotinib, cetuximab, and,
most recently, panitumumab. Therefore, the poor clinical
outcome of N1+high-risk patients might be improved by a more
comprehensive treatment approach, such as chemotherapy or
radiotherapy combined with cetuximab treatment. In addition to
EGFR, Fascin is also recognized as a therapeutic target [32], as
binding with migrastatin analogues inhibits Fascin activity and
blocks tumor metastasis [33]. Our prognostic model could
therefore lead to new avenues of therapy for patients with ESCC,
such as treatment with EGFR and/or Fascin inhibitors.

In addition, patients diagnosed with lymphatic
metastasis have in the past received several simultaneous treatments
in an unselective manner. However, such overtreatment often fails to
improve prognosis and leads to a massive waste of medical
resources. Our results also suggest that N1+low-risk patients could
be treated the same as lymph node-negative patients. Therefore,